Guard init startup against OpenTelemetry boot-loop conditions #3423
Merged
Co-authored-by: gantoine <3247106+gantoine@users.noreply.github.com>
Copilot AI changed the title from "[WIP] Fix boot loop due to opentelemetry error" to "Guard init startup against OpenTelemetry boot-loop conditions" on May 24, 2026
Collapse the duplicated OTEL_SDK_DISABLED / opentelemetry-instrument branches in run_startup, start_bin_gunicorn, start_bin_watcher, and start_bin_sync_watcher into two small helpers:

- otel_prefix: emits the wrapper as NUL-delimited argv tokens (for direct process invocation).
- otel_prefix_str: emits the wrapper as a shell-string prefix (for embedding inside `watchfiles --target-type command`).

Each call site becomes a single command instead of a 2- or 3-way branch with a fully duplicated command body. As a side effect, the watcher functions now also gain the `command -v opentelemetry-instrument` fallback that the gunicorn/startup paths added.
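A rough sketch of how the two helpers described in this commit message might look. The function bodies, the `otel_usable` and `run_wrapped` names, and the `--service_name` argument are assumptions made for illustration, not the merged code; `mapfile -d ''` needs bash 4.4 or newer.

```bash
#!/usr/bin/env bash
# Sketch only: bodies and the --service_name argument are assumptions.

# True only when OTEL is not disabled and the wrapper is found.
otel_usable() {
    [[ "${OTEL_SDK_DISABLED:-false}" != "true" ]] &&
        command -v opentelemetry-instrument >/dev/null 2>&1
}

# Emit the wrapper as NUL-delimited argv tokens; emit nothing when unusable.
otel_prefix() {
    otel_usable || return 0
    printf '%s\0' opentelemetry-instrument \
        --service_name "${OTEL_SERVICE_NAME_PREFIX-}$1"
}

# Emit the wrapper as one shell-quoted string; emit nothing when unusable.
otel_prefix_str() {
    otel_usable || return 0
    printf '%q ' opentelemetry-instrument \
        --service_name "${OTEL_SERVICE_NAME_PREFIX-}$1"
}

# Hypothetical call site: read the tokens into an array and prepend them.
run_wrapped() {
    local service="$1"; shift
    local -a wrap=()
    mapfile -d '' -t wrap < <(otel_prefix "$service")
    "${wrap[@]}" "$@"
}
```

With an empty `wrap` array, `"${wrap[@]}" "$@"` expands to the bare command, so a call such as `run_wrapped api python3 startup.py` needs no branch of its own.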
Pull request overview
This PR hardens the container init script so OpenTelemetry instrumentation is only applied when enabled and available, avoiding startup loops when the wrapper cannot be used.
Changes:
- Adds helper functions to generate OpenTelemetry wrapper prefixes.
- Uses those helpers for startup, Gunicorn, watcher, and sync watcher launch paths.
- Falls back to direct execution when OTEL is disabled or the wrapper is unavailable.
Collapse `otel_prefix` and `otel_prefix_str` into a single nameref-based
helper. Watchfiles call sites embed the array as a shell-quoted prefix
via `${wrap[*]@Q}`, which also fixes a quoting bug where an
`OTEL_SERVICE_NAME_PREFIX` containing a single quote would produce an
invalid command string and break the watcher.
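The single-helper approach can be sketched roughly as follows. The names `otel_wrap` and `watch_target`, the `--service_name` argument, and the example target command are hypothetical. Note that bash spells the quoting transformation `@Q` (bash 4.4+), and `local -n` needs bash 4.3+.

```bash
#!/usr/bin/env bash
# Sketch only: helper names and wrapper arguments are assumptions.

# Fill the caller's array (passed by name) with the wrapper tokens,
# or leave it untouched when OTEL is disabled or the wrapper is missing.
otel_wrap() {
    local -n _out="$1"   # nameref to the caller's array
    local service="$2"
    [[ "${OTEL_SDK_DISABLED:-false}" == "true" ]] && return 0
    command -v opentelemetry-instrument >/dev/null 2>&1 || return 0
    _out=(opentelemetry-instrument
          --service_name "${OTEL_SERVICE_NAME_PREFIX-}${service}")
}

# Build a command string suitable for a shell-parsed target.
# ${wrap[*]@Q} quotes each element so that embedded single quotes
# survive being re-parsed by a shell.
watch_target() {
    local service="$1" target="$2"
    local -a wrap=()
    otel_wrap wrap "$service"
    printf '%s\n' "${wrap[*]@Q} ${target}"
}
```

For example, with `OTEL_SERVICE_NAME_PREFIX="it's-"` the service name is emitted as `'it'\''s-watcher'`, which a shell parses back into the original value instead of an unterminated string.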
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
All callers declare a fresh `local -a wrap=()` before invoking, so the in-function reset is unnecessary.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
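To illustrate why the caller-side declaration is enough, here is a stripped-down sketch. The `otel_wrap` and `launch` names are hypothetical, and the helper deliberately never clears the array it is given.

```bash
#!/usr/bin/env bash
# Sketch only: names are assumptions, not the merged code.

# Fills the caller's array when wrapping applies; never resets it.
otel_wrap() {
    local -n _out="$1"
    [[ "${OTEL_SDK_DISABLED:-false}" == "true" ]] && return 0
    command -v opentelemetry-instrument >/dev/null 2>&1 || return 0
    _out=(opentelemetry-instrument)
}

# Each call starts from an empty function-local array, so nothing
# from a previous call can leak into the command line.
launch() {
    local -a wrap=()
    otel_wrap wrap
    "${wrap[@]}" "$@"
}
```

The reset would only matter for a caller that reused a non-empty array: in that case the helper would leave the old contents in place whenever OTEL is disabled or the wrapper is missing.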
gantoine approved these changes on May 24, 2026
RomM could enter a restart loop during container boot when OpenTelemetry wrapping was applied to startup/Gunicorn under incompatible OTEL env/runtime conditions. This change makes OTEL instrumentation conditional so the service still boots cleanly when OTEL is disabled or unavailable.
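The conditional behavior described here can be sketched as a small decision function. The names `otel_mode` and `run_startup_sketch` and the mode labels are hypothetical, and the check assumes the variable is set to the literal string `true`.

```bash
#!/usr/bin/env bash
# Sketch only: function names and mode labels are assumptions.

# Report how a process should be launched under the current environment.
otel_mode() {
    if [[ "${OTEL_SDK_DISABLED:-false}" == "true" ]]; then
        echo "direct:disabled"
    elif ! command -v opentelemetry-instrument >/dev/null 2>&1; then
        echo "direct:wrapper-missing"
    else
        echo "instrumented"
    fi
}

# Run the given command, wrapped only when instrumentation is usable.
run_startup_sketch() {
    case "$(otel_mode)" in
        instrumented) opentelemetry-instrument "$@" ;;
        *)            "$@" ;;
    esac
}
```

A call such as `run_startup_sketch python3 startup.py` then starts the script directly in both fallback cases, so a missing or disabled wrapper no longer prevents the service from booting.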
Startup path hardening (`run_startup`)
- Runs `startup.py` directly when `OTEL_SDK_DISABLED=true`.
- Runs `startup.py` directly when `opentelemetry-instrument` is not present.

Gunicorn launch hardening (`start_bin_gunicorn`)

Behavioral impact